Notes:

Summary plots - Sample strategy is on the y-axis and number of sites is on the x-axis. Plots are paired vertically by parameter level, and the value in each cell is the mean across all of the simulations for that parameter level. Note that each mean therefore averages over all of the other varying simulation parameters.

Full plots - Sample strategy is on the y-axis and number of sites is on the x-axis. Each plot represents a unique simulation, and the value in each cell is the mean across all 10 iterations of that simulation and all three unique landscape seeds (i.e., all three sets of Neutral Landscape Models).
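
The difference between the two kinds of plot comes down to which variables define a cell and which are averaged over. A minimal sketch of that aggregation follows; the record layout, field names, and parameter values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-run results: one record per iteration of a simulation.
# Field names and parameter values are illustrative, not taken from the study.
records = [
    {"sampstrat": "R", "nsamp": 36, "m": 0.25, "H": 0.05, "value": 0.2},
    {"sampstrat": "R", "nsamp": 36, "m": 0.25, "H": 0.05, "value": 0.4},
    {"sampstrat": "R", "nsamp": 36, "m": 0.25, "H": 0.5, "value": 0.6},
    {"sampstrat": "R", "nsamp": 36, "m": 0.25, "H": 0.5, "value": 0.8},
]


def cell_means(records, keys):
    """Average `value` over all records that share the same values for `keys`."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in keys)].append(rec["value"])
    return {cell: mean(vals) for cell, vals in groups.items()}


# Summary plot for parameter m: H is not a key, so it is averaged over.
summary_m = cell_means(records, ["m", "sampstrat", "nsamp"])

# Full plot: every simulation parameter is a key, so only the repeated
# runs of the same simulation are averaged.
full = cell_means(records, ["m", "H", "sampstrat", "nsamp"])
```

In this toy input the summary cell pools both levels of H, while the full plot keeps them as separate cells.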

K - number of latent factors used in LFMM

TPR - True Positive Rate

FDR - False Discovery Rate

lasso vs ridge - “lasso” and “ridge” are two estimation methods available in LFMM that use different penalty functions (Caye et al., 2019)

pRDA - partial RDA conditioning on two PC axes to control for population genetic structure
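
For readers less familiar with TPR and FDR, the sketch below shows the standard definitions applied to sets of locus identifiers. The function name, the example loci, and the convention of returning an FDR of 0 when nothing is detected are assumptions made here for illustration; the notes do not state how empty detection sets were handled.

```python
def tpr_fdr(detected, adaptive):
    """Return (TPR, FDR) for detected loci given the truly adaptive loci.

    TPR = true positives / number of adaptive loci
    FDR = false positives / number of detected loci
    """
    detected, adaptive = set(detected), set(adaptive)
    true_pos = len(detected & adaptive)
    false_pos = len(detected - adaptive)
    tpr = true_pos / len(adaptive) if adaptive else float("nan")
    # Assumption: with no detections there are no false discoveries.
    fdr = false_pos / len(detected) if detected else 0.0
    return tpr, fdr


# Hypothetical example: 4 adaptive loci, 4 detections, 3 of them correct.
tpr, fdr = tpr_fdr(detected={1, 2, 3, 7}, adaptive={1, 2, 3, 4})
```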


1. LFMM

1.1 Individual sampling

1.1.1 Summary plots

1.1.2 Linear mixed effects models

Only results from the ridge method are used in the final models

TPR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp 0.0002 3.8849 3.8849 1 15.3180K 374.16691 2.2 × 10−82***
sampstrat 0.1243 0.8737 0.2912 3 15.3180K 28.04836 4.4 × 10−18***
K 0.0338 4.3713 4.3713 1 15.3180K 421.01179 2.6 × 10−92***
m 0.0358 4.9147 4.9147 1 15.3180K 473.35104 2.2 × 10−103***
phi 0.1114 47.5572 47.5572 1 15.3180K 4580.39346 0.0***
H 0.1150 50.6577 50.6577 1 15.3180K 4879.02098 0.0***
r −0.0183 1.2795 1.2795 1 15.3180K 123.23782 1.6 × 10−28***
*** p < 0.001
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - G −0.0014 0.0023 −0.5948 0.9336946
EG - R 0.0030 0.0023 1.2961 0.5653833
EG - T*** 0.0176 0.0023 7.5567 2.7 × 10−13***
G - R 0.0044 0.0023 1.8908 0.2319131
G - T*** 0.0190 0.0023 8.1508 5.0 × 10−14***
R - T*** 0.0146 0.0023 6.2616 2.3 × 10−9***
*** p < 0.001
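
As a reading aid for the tables in this document, the sketch below checks how the columns relate, assuming the usual layout for mixed-model ANOVA and pairwise-contrast output: Mean Sq = Sum Sq / NumDF, F value = Mean Sq / residual variance, and Z ratio = Estimate / SE. The numbers are copied from the two tables above; the variable names are illustrative.

```python
# (Sum Sq, NumDF, F value) copied from the linear mixed effect model table above.
anova_rows = {
    "nsamp": (3.8849, 1, 374.16691),
    "sampstrat": (0.8737, 3, 28.04836),
    "K": (4.3713, 1, 421.01179),
}

implied_resid_var = {}
for name, (sum_sq, num_df, f_value) in anova_rows.items():
    mean_sq = sum_sq / num_df
    # If F = Mean Sq / residual variance, every row should imply the same variance.
    implied_resid_var[name] = mean_sq / f_value
    print(f"{name}: Mean Sq = {mean_sq:.4f}, "
          f"implied residual variance = {implied_resid_var[name]:.5f}")

# (Estimate, SE, reported Z ratio) for the EG - T contrast in the Tukey table above.
estimate, se, z_reported = 0.0176, 0.0023, 7.5567
z_from_rounded = estimate / se
print(f"EG - T: Z from rounded columns = {z_from_rounded:.2f}, reported = {z_reported}")
```

The three rows imply nearly the same residual variance, and the Z ratio recomputed from the printed columns differs slightly from the reported value because Estimate and SE are rounded to four decimals.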

FDR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp −0.0027 549.7338 549.7338 1 15.3180K 3196.9834256 0.0***
sampstrat 0.8382 16.4335 5.4778 3 15.3180K 31.8563530 1.6 × 10−20***
K −0.0468 8.4059 8.4059 1 15.3180K 48.8845389 2.8 × 10−12***
m −0.0687 18.1105 18.1105 1 15.3180K 105.3216241 1.2 × 10−24***
phi −0.0317 3.8567 3.8567 1 15.3180K 22.4287145 2.2 × 10−6***
H 0.0703 18.9143 18.9143 1 15.3180K 109.9965860 1.2 × 10−25***
r −0.0029 0.0324 0.0324 1 15.3180K 0.1886837 0.66
*** p < 0.001
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - G 0.0073 0.0095 0.7712 0.86748324
EG - R* −0.0253 0.0095 −2.6713 0.03788814*
EG - T*** −0.0763 0.0095 −8.0517 5.6 × 10−14***
G - R** −0.0326 0.0095 −3.4424 3.2 × 10−3**
G - T*** −0.0836 0.0095 −8.8220 3.0 × 10−14***
R - T*** −0.0510 0.0095 −5.3819 4.4 × 10−7***
*** p < 0.001
** p < 0.01
* p < 0.05

1.1.3 Full plots

K

TPR

FDR

Total number of loci

1.2 Site sampling

1.2.1 Summary plots

1.2.2 Linear mixed effects models

Only results from the ridge method are used in the final models

TPR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp −0.0004 0.0593 0.0593 1 8.6090K 8.070827 4.5 × 10−3**
sampstrat 0.0362 0.1180 0.0590 2 8.6090K 8.023909 3.3 × 10−4***
K 0.0160 0.5507 0.5507 1 8.6090K 74.893340 5.9 × 10−18***
m 0.0417 3.7529 3.7529 1 8.6090K 510.419793 7.7 × 10−110***
phi 0.0536 6.1900 6.1900 1 8.6090K 841.869068 1.1 × 10−176***
H 0.0741 11.8217 11.8217 1 8.6090K 1607.807275 1.7 × 10−322***
r −0.0213 0.9766 0.9766 1 8.6090K 132.822293 1.7 × 10−30***
*** p < 0.001
** p < 0.01
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - EQ 0.0043 0.0023 1.8813 0.14407422
EG - R*** 0.0091 0.0023 4.0035 1.8 × 10−4***
EQ - R 0.0048 0.0023 2.1241 0.08499688
*** p < 0.001

FDR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp −0.0083 25.3129 25.3129 1 8.6110K 842.37960203 8.7 × 10−177***
sampstrat 1.0906 0.0540 0.0270 2 8.6110K 0.89778252 0.410
K 0.0096 0.1966 0.1966 1 8.6110K 6.54282626 0.011*
m 0.0161 0.5617 0.5617 1 8.6110K 18.69390804 1.6 × 10−5***
phi −0.0238 1.2192 1.2192 1 8.6110K 40.57304948 2.0 × 10−10***
H 0.0009 0.0018 0.0018 1 8.6110K 0.06058928 0.810
r 0.0036 0.0281 0.0281 1 8.6110K 0.93373450 0.330
*** p < 0.001
* p < 0.05
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - EQ −0.0061 0.0046 −1.3303 0.3782653
EG - R −0.0024 0.0046 −0.5263 0.8584367
EQ - R 0.0037 0.0046 0.8037 0.7007541

1.2.3 Full plots

K

TPR

FDR

Total number of loci

1.3 Latent factor test

To determine the effect of K-selection on performance, we ran LFMM using a constant K for all of the subsampled datasets from the same simulation (i.e., all sample sizes and strategies had the same K) and compared that to K-selection based on each subsampled dataset (i.e., K was allowed to vary by sample size and strategy). The constant K for each simulation was selected using a “full” dataset of 2000 randomly selected individuals (i.e., much greater than 10% of the total population).
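
The two K-selection schemes described above can be sketched as follows. The helper `assign_k`, its arguments, and the placeholder `select_k` rule are hypothetical names introduced for illustration; in particular, the placeholder does not implement the actual K-selection procedure and only stands in for it so that the sketch runs.

```python
from typing import Callable, Dict, Hashable, Sequence


def assign_k(
    subsamples: Dict[Hashable, Sequence],
    full_dataset: Sequence,
    select_k: Callable[[Sequence], int],
    constant: bool,
) -> Dict[Hashable, int]:
    """Return the K used for each subsampled dataset of one simulation.

    constant=True  -> K is chosen once from the full dataset and reused.
    constant=False -> K is chosen separately from each subsampled dataset.
    """
    if constant:
        k_full = select_k(full_dataset)
        return {name: k_full for name in subsamples}
    return {name: select_k(data) for name, data in subsamples.items()}


# Placeholder selection rule (NOT the real procedure): K grows with sample size.
def select_k(data: Sequence) -> int:
    return max(1, len(data) // 100)


full_dataset = list(range(2000))  # stands in for the 2000-individual "full" dataset
subsamples = {
    ("R", 36): list(range(36)),    # keys are (sampling strategy, sample size)
    ("T", 324): list(range(324)),
}

constant_k = assign_k(subsamples, full_dataset, select_k, constant=True)
variable_k = assign_k(subsamples, full_dataset, select_k, constant=False)
```

Under the constant scheme every subsample shares one K, whereas under the variable scheme K can differ by sample size and strategy.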

We compared the results using the same statistics (i.e., TPR and FDR) and the same linear mixed effects models. Overall, the results did not vary substantially between constant and variable K. The only notable difference was that, under individual-based sampling, migration and population size had a negative effect on FDR when a variable K was used and a positive effect on FDR when a constant K was used. This makes sense because migration and population size have the biggest effect on underlying population structure and thereby on K-selection: smaller population sizes and weaker migration result in greater population structure.

For constant K, this is reflected in the larger K identified by the Tracy-Widom test when migration and population size are lower. For variable K, a similar effect is observed; however, it is confounded by the effects of sample size and sampling strategy. In general, larger sample sizes and transect sampling had larger K values than the other sampling strategies. Returning to FDR, under individual-based sampling with weak migration and small population size (i.e., stronger population structure), transect sampling has a higher FDR when a variable K is used than when a constant K is used. Altogether, this shows how population genetic structure and sampling structure affect K-selection and, ultimately, the performance of LFMM. We recommend that a range of K values be evaluated when performing LFMM, with particular care taken under transect sampling schemes.

Individual sampling

TPR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp 0.0002 2.2592 2.2592 1 15.3180K 284.911850 2.4 × 10−63***
sampstrat 0.0995 0.2096 0.0699 3 15.3180K 8.812327 7.8 × 10−6***
K 0.0231 2.0491 2.0491 1 15.3180K 258.413832 1.1 × 10−57***
m 0.0550 11.5739 11.5739 1 15.3180K 1459.604079 3.9 × 10−305***
phi 0.0796 24.2608 24.2608 1 15.3180K 3059.565188 0.0***
H 0.0814 25.3972 25.3972 1 15.3180K 3202.871668 0.0***
r −0.0148 0.8346 0.8346 1 15.3180K 105.249242 1.3 × 10−24***
*** p < 0.001
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - G −0.0017 0.0020 −0.8564 0.8272429
EG - R 0.0008 0.0020 0.3925 0.9794972
EG - T*** 0.0080 0.0020 3.9129 5.3 × 10−4***
G - R 0.0025 0.0020 1.2489 0.5956070
G - T*** 0.0097 0.0020 4.7687 1.1 × 10−5***
R - T** 0.0072 0.0020 3.5208 2.4 × 10−3**
*** p < 0.001
** p < 0.01

FDR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp −0.0025 501.9981 501.9981 1 15.3180K 3776.700755 0.0***
sampstrat 0.5655 14.8749 4.9583 3 15.3180K 37.302920 5.2 × 10−24***
K 0.0220 1.8480 1.8480 1 15.3180K 13.902783 1.9 × 10−4***
m 0.0711 19.3767 19.3767 1 15.3180K 145.777616 2.1 × 10−33***
phi −0.0329 4.1374 4.1374 1 15.3180K 31.127031 2.5 × 10−8***
H 0.0974 36.3261 36.3261 1 15.3180K 273.293640 7.3 × 10−61***
r −0.0122 0.5737 0.5737 1 15.3180K 4.316443 0.038*
*** p < 0.001
* p < 0.05
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - G 0.0024 0.0083 0.2905 0.991466
EG - R** −0.0306 0.0083 −3.6725 1.4 × 10−3**
EG - T*** −0.0748 0.0083 −8.9768 2.7 × 10−14***
G - R*** −0.0330 0.0083 −3.9628 4.3 × 10−4***
G - T*** −0.0772 0.0083 −9.2667 4.3 × 10−14***
R - T*** −0.0442 0.0083 −5.3064 6.7 × 10−7***
*** p < 0.001
** p < 0.01

Site sampling

TPR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp −0.0018 1.1952 1.1952 1 8.6090K 183.240394 2.5 × 10−41***
sampstrat 0.0191 0.0892 0.0446 2 8.6090K 6.835991 1.1 × 10−3**
K 0.0152 0.4970 0.4970 1 8.6090K 76.192821 3.1 × 10−18***
m 0.0389 3.2589 3.2589 1 8.6090K 499.649532 1.3 × 10−107***
phi 0.0442 4.2123 4.2123 1 8.6090K 645.825647 1.9 × 10−137***
H 0.0656 9.2807 9.2807 1 8.6090K 1422.900812 2.6 × 10−288***
r −0.0206 0.9155 0.9155 1 8.6090K 140.355642 4.0 × 10−32***
*** p < 0.001
** p < 0.01
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - EQ* 0.0057 0.0021 2.6691 0.02077436*
EG - R** 0.0076 0.0021 3.5508 1.1 × 10−3**
EQ - R 0.0019 0.0021 0.8835 0.65075396
** p < 0.01
* p < 0.05

FDR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp −0.0187 129.4249 129.4249 1 8.6090K 1992.658631 0.0***
sampstrat 1.2219 2.3866 1.1933 2 8.6090K 18.372105 1.1 × 10−8***
K 0.0190 0.7755 0.7755 1 8.6091K 11.939477 5.5 × 10−4***
m 0.0196 0.8298 0.8298 1 8.6091K 12.776418 3.5 × 10−4***
phi −0.0191 0.7830 0.7830 1 8.6091K 12.054888 5.2 × 10−4***
H 0.0389 3.2570 3.2570 1 8.6091K 50.145685 1.5 × 10−12***
r 0.0065 0.0916 0.0916 1 8.6090K 1.410227 0.24
*** p < 0.001
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - EQ*** 0.0301 0.0067 4.4781 2.2 × 10−5***
EG - R −0.0087 0.0067 −1.2978 0.396384
EQ - R*** −0.0388 0.0067 −5.7761 2.3 × 10−8***
*** p < 0.001

2. RDA

2.1 Individual sampling

2.1.1 Summary plots

2.1.2 Linear mixed effects models

Only results from the standard RDA (not the partial RDA) are used in the final models

TPR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp 0.0002 2.8394 2.8394 1 15.3480K 545.389450 1.5 × 10−118***
sampstrat 0.0875 0.3001 0.1000 3 15.3480K 19.211563 2.0 × 10−12***
K 0.0165 1.0439 1.0439 1 15.3480K 200.509665 3.1 × 10−45***
m 0.0438 7.3555 7.3555 1 15.3480K 1412.856406 7.2 × 10−296***
phi 0.0440 7.4213 7.4213 1 15.3480K 1425.494641 2.2 × 10−298***
H 0.0436 7.2900 7.2900 1 15.3480K 1400.274445 2.3 × 10−293***
r 0.0036 0.0488 0.0488 1 15.3480K 9.371404 2.2 × 10−3**
*** p < 0.001
** p < 0.01
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - G 0.0005 0.0016 0.2768 0.9926010
EG - R 0.0029 0.0016 1.7594 0.2931233
EG - T*** 0.0110 0.0016 6.6817 1.4 × 10−10***
G - R 0.0024 0.0016 1.4826 0.4480091
G - T*** 0.0105 0.0016 6.4050 9.0 × 10−10***
R - T*** 0.0081 0.0016 4.9224 5.1 × 10−6***
*** p < 0.001

FDR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp 0.0003 5.7961 5.7961 1 15.3480K 373.268676 3.4 × 10−82***
sampstrat 0.1162 0.2103 0.0701 3 15.3480K 4.515102 3.6 × 10−3**
K 0.0153 0.9004 0.9004 1 15.3480K 57.987784 2.8 × 10−14***
m 0.0753 21.8007 21.8007 1 15.3480K 1403.969867 4.2 × 10−294***
phi 0.0647 16.0600 16.0600 1 15.3480K 1034.264854 1.2 × 10−219***
H 0.0610 14.2799 14.2799 1 15.3480K 919.628150 3.1 × 10−196***
r 0.0091 0.3176 0.3176 1 15.3480K 20.451012 6.2 × 10−6***
*** p < 0.001
** p < 0.01
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - G −0.0010 0.0028 −0.3555 0.98460216
EG - R 0.0020 0.0028 0.6959 0.89870855
EG - T* 0.0085 0.0028 2.9888 0.01487071*
G - R 0.0030 0.0028 1.0514 0.71907362
G - T** 0.0095 0.0028 3.3443 4.6 × 10−3**
R - T 0.0065 0.0028 2.2929 0.09963697
** p < 0.01
* p < 0.05

2.1.3 Full plots

TPR

FDR

Total number of loci

2.2 Site sampling

2.2.1 Summary plots

2.2.2 Linear mixed effects models

Only results from the standard RDA (not the partial RDA) are used in the final models

TPR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp 0.0015 0.8335 0.8335 1 8.6290K 232.154396 9.5 × 10−52***
sampstrat 0.0717 0.1133 0.0567 2 8.6290K 15.782322 1.4 × 10−7***
K 0.0173 0.6489 0.6489 1 8.6290K 180.732574 8.6 × 10−41***
m 0.0322 2.2403 2.2403 1 8.6290K 623.983518 5.0 × 10−133***
phi 0.0321 2.2322 2.2322 1 8.6290K 621.743005 1.4 × 10−132***
H 0.0315 2.1447 2.1447 1 8.6290K 597.363316 1.3 × 10−127***
r 0.0019 0.0081 0.0081 1 8.6290K 2.261166 0.13
*** p < 0.001
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - EQ −0.0001 0.0016 −0.0550 0.9983351
EG - R*** 0.0076 0.0016 4.8378 3.9 × 10−6***
EQ - R*** 0.0077 0.0016 4.8928 3.0 × 10−6***
*** p < 0.001

FDR

Linear mixed effect model
statistic ~ nsamp + sampstrat + K + m + phi + H + r + (1 | seed)
Predictors Fixed Effects Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
nsamp 0.0018 1.1681 1.1681 1 8.6290K 140.00975 4.7 × 10−32***
sampstrat 0.0845 0.1085 0.0542 2 8.6290K 6.50242 1.5 × 10−3**
K 0.0133 0.3801 0.3801 1 8.6290K 45.55864 1.6 × 10−11***
m 0.0378 3.0791 3.0791 1 8.6290K 369.07182 1.4 × 10−80***
phi 0.0336 2.4370 2.4370 1 8.6290K 292.10683 2.0 × 10−64***
H 0.0297 1.8994 1.8994 1 8.6290K 227.67214 8.5 × 10−51***
r 0.0144 0.4449 0.4449 1 8.6290K 53.32587 3.1 × 10−13***
*** p < 0.001
** p < 0.01
Tukey test
pairwise ~ sampstrat
Contrast Estimate SE Z ratio p
EG - EQ* 0.0057 0.0024 2.3728 0.04642683*
EG - R** 0.0085 0.0024 3.5382 1.2 × 10−3**
EQ - R 0.0028 0.0024 1.1654 0.47395300
** p < 0.01
* p < 0.05

2.2.3 Full plots

TPR

FDR

Total number of loci